Start RStudio and open the egrid-lab-setup.R file.
Highlight all the code with the mouse, then press Ctrl+Enter or click the Run button in the upper-right corner of the editor window.
The commands below load the necessary libraries, create a data frame from the eGRID spreadsheet, select the variables needed for the lab, and rename them for ease of use and understanding. You have already run a copy of them from the egrid-lab-setup.R file.
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.3 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 2.0.0 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(readxl)
library(sf)
## Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1
library(tmap)
# sets working directory to the class folder (edit this path to match your machine)
setwd("c:/temp/class0921")
message(paste0("Working directory = ", getwd()))
## Working directory = c:/temp/class0921
# loads custom functions
source("./functions/functions1a.R")
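# reads the plant-level sheet, skipping the first row;
# tolow2() is a custom helper loaded from functions1a.R
egrid1 = read_excel("./data/egrid2018_data_v2.xlsx",
                    sheet = "PLNT18", skip = 1) %>%
  tolow2()
# renames the variables used in the lab and converts units
egrid2 = egrid1 %>%
  mutate(StateAbbr = pstatabb,
         PlantName = pname,
         NamePlateCapMW = namepcap,
         Fuel = plfuelct,
         NetGenGWh = plngenan / 10^3,                    # MWh to GWh
         CO2tonsm = replace_na(plco2eqa, 0) / 10^6) %>%  # tons to million tons, missing set to 0
  select(StateAbbr, PlantName, Fuel,
         NamePlateCapMW,
         NetGenGWh,
         CO2tonsm,
         lat, lon)
# keeps Georgia plants with northern-hemisphere, western-hemisphere coordinates
st2 = egrid2 %>%
  filter(StateAbbr == "GA") %>%
  filter(lat > 0 & lon < 0)
# keeps large Georgia plants (at least 1,000 MW)
st3 = st2 %>% filter(NamePlateCapMW >= 1000)
# converts the plant table to point features (EPSG:4326, WGS 84)
st2geo = st_as_sf(st2, coords = c("lon", "lat"),
                  crs = 4326) %>%
  select(PlantName, everything())
stsolargeo = st2geo %>%
  filter(Fuel == "SOLAR")
# reads the Georgia county boundaries
gacounty = st_read("./gisdata/gacounty.shp") %>%
  select(NAME, everything())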
## Reading layer `gacounty' from data source `C:\temp\class0921\gisdata\gacounty.shp' using driver `ESRI Shapefile'
## Simple feature collection with 159 features and 12 fields
## Geometry type: MULTIPOLYGON
## Dimension: XYZ
## Bounding box: xmin: -85.60516 ymin: 30.35784 xmax: -80.84055 ymax: 35.00066
## z_range: zmin: 0 zmax: 0
## Geodetic CRS: NAD83
istates = st_read("./gisdata/interstates.shp")
## Reading layer `interstates' from data source
## `C:\temp\class0921\gisdata\interstates.shp' using driver `ESRI Shapefile'
## Simple feature collection with 689 features and 7 fields
## Geometry type: MULTILINESTRING
## Dimension: XY
## Bounding box: xmin: -158.0901 ymin: 21.27311 xmax: -67.78119 ymax: 64.88822
## Geodetic CRS: WGS 84
The focus of this lab is using ggplot2 for basic visualization of power plant level electricity and CO2 emissions data from the EPA eGRID dataset. eGRID stands for “Emissions & Generation Resource Integrated Database” and is available from
The setup code above has already loaded the 2018 EPA eGRID dataset from the egrid2018_data_v2.xlsx spreadsheet. That spreadsheet is located in a subdirectory named “data” one level below your working directory. Throughout the course we will store input data in its own folder so we can easily find it and update it whenever new data becomes available. As long as the data format has not changed, our R code should work exactly the same way with the updated data.
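Because the data sits in a known subfolder, its path can be built relative to the working directory and checked before reading. The following is a minimal sketch using the folder and file name from the setup code; the variable name data_file is only illustrative.

```r
# Build the path relative to the working directory
data_file = file.path("data", "egrid2018_data_v2.xlsx")
print(data_file)

# Check that the file is where we expect before trying to read it
if (file.exists(data_file)) {
  message("Found ", data_file)
} else {
  message("Could not find ", data_file, " under ", getwd())
}
```

If the file is reported missing, the most likely cause is that the working directory is not the class folder.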
The setup code has also selected a subset of variables and renamed them for convenience, and created two additional datasets:
st2: has only Georgia power plants
st3: has only large Georgia power plants (nameplate capacity of at least 1.0 GW)
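The size cutoff behind st3 can be tried on a toy table. This is a sketch, not the lab data: the first two rows reuse capacities shown in this lab's output, and "Example Solar Farm" is a hypothetical plant added only so that one row gets dropped.

```r
library(dplyr)

# Toy data: two real plants from the lab output plus one hypothetical small plant
plants = data.frame(
  StateAbbr = c("GA", "GA", "GA"),
  PlantName = c("Bowen", "Edwin I Hatch", "Example Solar Farm"),
  NamePlateCapMW = c(3540.4, 1721.8, 20),
  stringsAsFactors = FALSE
)

# Same condition the setup code uses to build st3: at least 1,000 MW (1.0 GW)
large = plants %>% filter(NamePlateCapMW >= 1000)
print(large$PlantName)
```

Only the two plants at or above 1,000 MW remain in large.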
Type each of the following commands into the console to confirm your current working directory and explore basic information about the st3 dataset:
getwd()
## [1] "C:/courses/cca/class0921"
summary(st3)
## StateAbbr PlantName Fuel NamePlateCapMW
## Length:12 Length:12 Length:12 Min. :1099
## Class :character Class :character Class :character 1st Qu.:1342
## Mode :character Mode :character Mode :character Median :1734
## Mean :2220
## 3rd Qu.:3287
## Max. :4520
##
## NetGenGWh CO2tonsm lat lon
## Min. : 92.23 Min. : 0.00000 Min. :31.93 Min. :-85.04
## 1st Qu.: 4066.98 1st Qu.: 0.05145 1st Qu.:33.12 1st Qu.:-84.94
## Median : 9053.31 Median : 2.85259 Median :33.38 Median :-84.69
## Mean : 9755.88 Mean : 4.49050 Mean :33.33 Mean :-83.89
## 3rd Qu.:14912.00 3rd Qu.: 4.53004 3rd Qu.:33.55 3rd Qu.:-83.06
## Max. :19959.13 Max. :18.39730 Max. :34.71 Max. :-81.18
## NA's :1
head(st3)
glimpse(st3)
## Rows: 12
## Columns: 8
## $ StateAbbr <chr> "GA", "GA", "GA", "GA", "GA", "GA", "GA", "GA", "GA", "~
## $ PlantName <chr> "Bowen", "Edwin I Hatch", "Harllee Branch", "Jack McDon~
## $ Fuel <chr> "COAL", "NUCLEAR", "COAL", "GAS", "GAS", "COAL", "GAS",~
## $ NamePlateCapMW <dbl> 3540.4, 1721.8, 1746.2, 3202.0, 1376.6, 3564.0, 1099.2,~
## $ NetGenGWh <dbl> 13619.914, 14403.550, NA, 16799.053, 9053.313, 15420.44~
## $ CO2tonsm <dbl> 14.78503521, 0.00000000, 0.00000000, 6.82052236, 3.7665~
## $ lat <dbl> 34.12560, 31.93420, 33.19500, 33.82390, 32.34780, 33.06~
## $ lon <dbl> -84.92220, -82.34470, -83.29830, -84.47580, -81.18170, ~
View(st3)
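The summary() output above reports one NA in NetGenGWh. As a small sketch of why that matters, the first five NetGenGWh values from the glimpse() output are copied into a vector below (the name net_gen is illustrative): most summary functions return NA unless told to drop missing values.

```r
# First five NetGenGWh values from the glimpse() output above
net_gen = c(13619.914, 14403.550, NA, 16799.053, 9053.313)

print(sum(is.na(net_gen)))          # number of missing values
print(mean(net_gen))                # NA, because one value is missing
print(mean(net_gen, na.rm = TRUE))  # mean of the non-missing values only
```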
Remember:
NamePlateCapMW: nameplate plant capacity in megawatts of power
NetGenGWh: annual net generation in gigawatt-hours of energy
You’ll just have to memorize these.
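One way to keep the two units straight is that capacity (MW) multiplied by hours gives energy (MWh). The sketch below uses the Bowen values from the glimpse() output and assumes a year of 8,760 hours; the variable names are illustrative and are not part of the lab code.

```r
cap_mw = 3540.4          # Bowen nameplate capacity, in MW (power)
net_gen_gwh = 13619.914  # Bowen annual net generation, in GWh (energy)

hours_per_year = 24 * 365  # 8,760 hours, ignoring leap years

# Energy if the plant ran at full capacity all year:
# MW x hours = MWh, then divide by 10^3 to get GWh
max_gen_gwh = cap_mw * hours_per_year / 10^3

# Share of that maximum the plant actually generated
share = net_gen_gwh / max_gen_gwh

print(round(max_gen_gwh, 1))
print(round(share, 2))
```

The plant generated a bit under half of what it could have produced running flat out all year, which shows why capacity and generation have to be read as different quantities.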
Large number abbreviations
For reference, in 2019 Georgia retail electricity sales were
With a state population of 10.6 million, that is about 13 MWh per person per year.
We can use tmap to visualize electricity generation across Georgia:
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(gacounty) +
  tm_polygons(col="palegreen") +
  tm_shape(istates) +
  tm_lines(col="tomato4") +
  tm_shape(st2geo) +
  tm_dots("Fuel", size="NetGenGWh")
## Legend for symbol sizes not available in view mode.
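The message above is specific to interactive view mode. As a hedged sketch, the same kind of dot layer can be built in static plot mode, where tmap should be able to draw a size legend. The two plants and their values are copied from the glimpse() output; the object names plants, plants_geo and static_map are illustrative.

```r
library(sf)
library(tmap)

# Two plants with values taken from the glimpse() output above
plants = data.frame(
  PlantName = c("Bowen", "Edwin I Hatch"),
  Fuel = c("COAL", "NUCLEAR"),
  NetGenGWh = c(13619.914, 14403.550),
  lat = c(34.12560, 31.93420),
  lon = c(-84.92220, -82.34470),
  stringsAsFactors = FALSE
)

# Convert to point features, as the setup code does for st2geo
plants_geo = st_as_sf(plants, coords = c("lon", "lat"), crs = 4326)

# Switch from interactive "view" mode to static "plot" mode
tmap_mode("plot")

# Build the map object; print(static_map) would draw it
static_map = tm_shape(plants_geo) +
  tm_dots("Fuel", size = "NetGenGWh")
```

Run tmap_mode("view") again to return to the interactive map.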